A Spectral Based Clustering Algorithm for Categorical Data with Maximum Modularity
Abstract
In this paper we propose a spectral-based clustering algorithm that maximizes an extended Modularity measure for categorical data. First, we establish the connection with the Relational Analysis criterion. Second, we show that maximizing the extended Modularity can be cast as a trace maximization problem. A spectral-based algorithm is then presented to search for the partitions that maximize the extended Modularity criterion. Experimental results indicate that the new algorithm is efficient and effective at finding a good clustering across a variety of real-world data sets.
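The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every specific choice in it is an assumption: the similarity matrix is taken to be K = Z Z^T for a one-hot (disjunctive) coding Z, the modularity matrix follows the Newman-Girvan form B = (K - d d^T / T) / T with d the row sums of K and T their total, the relaxed trace maximization keeps the k-1 leading eigenvectors of B, and a small k-means with farthest-point initialization turns the embedding into a partition. All function names are hypothetical.

```python
import numpy as np

def one_hot(data):
    """Disjunctive (one-hot) coding of an (n_objects, n_attributes) categorical table."""
    data = np.asarray(data)
    blocks = []
    for col in data.T:
        _, codes = np.unique(col, return_inverse=True)
        blocks.append(np.eye(codes.max() + 1)[codes])
    return np.hstack(blocks)

def modularity_matrix(Z):
    """Assumed form: B = (K - d d^T / T) / T, where K = Z Z^T counts shared categories."""
    K = Z @ Z.T
    d = K.sum(axis=1)
    total = d.sum()
    return (K - np.outer(d, d) / total) / total

def modularity(B, labels):
    """Sum of B over pairs in the same cluster, i.e. trace(X^T B X) for indicator X."""
    same = labels[:, None] == labels[None, :]
    return float(B[same].sum())

def farthest_point_init(U, k):
    """Deterministic seeding: start at the largest-norm row, then add the farthest rows."""
    idx = [int(np.argmax(np.linalg.norm(U, axis=1)))]
    for _ in range(k - 1):
        dist = ((U[:, None, :] - U[idx][None, :, :]) ** 2).sum(-1).min(axis=1)
        idx.append(int(np.argmax(dist)))
    return U[idx].copy()

def spectral_modularity_clustering(data, k, n_iter=50):
    """Relaxed trace maximization (eigenvectors of B) followed by k-means rounding."""
    if k < 2:
        raise ValueError("k must be at least 2")
    B = modularity_matrix(one_hot(data))
    _, vecs = np.linalg.eigh(B)          # eigenvalues come back in ascending order
    U = vecs[:, ::-1][:, :k - 1]         # k-1 leading eigenvectors (an assumption)
    centers = farthest_point_init(U, k)
    for _ in range(n_iter):              # plain Lloyd iterations
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(axis=0)
    return labels, modularity(B, labels)

# Toy table: six objects, three categorical attributes, two obvious groups.
data = [["a", "x", "p"], ["a", "x", "p"], ["a", "x", "q"],
        ["b", "y", "r"], ["b", "y", "r"], ["b", "y", "s"]]
labels, q = spectral_modularity_clustering(data, k=2)
print(labels, q)
```

The k-means rounding step and the number of eigenvectors retained are the parts most likely to differ from the published algorithm; they are used here only because they are common choices for spectral relaxations of this kind.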
Similar resources
Modularity and Spectral Co-Clustering for Categorical Data
To tackle the co-clustering problem on categorical data, we consider a spectral approach. We first define a generalized modularity measure for the co-clustering task. Then, we reformulate its maximization as a trace maximization problem. Finally, we develop a spectral-based co-clustering algorithm that performs this maximization. The proposed algorithm is then capable of clustering rows and columns si...
Non-parametric latent modeling and network clustering
The paper presents a non-parametric approach to latent and co-latent modeling of bivariate data, based upon alternating minimization of the Kullback-Leibler divergence (EM algorithm) for complete log-linear models. For categorical data, the iterative algorithm generates a soft clustering of both rows and columns of the contingency table. Well-known results are systematically revisited, and some ...
Clustering Categorical Data Using an Extended Modularity Measure
Newman and Girvan [12] recently proposed an objective function for graph clustering called the Modularity function which allows automatic selection of the number of clusters. Empirically, higher values of the Modularity function have been shown to correlate well with good graph clustering. In this paper we propose an extended Modularity measure for categorical data clustering; first, we establi...
A Multi-Objective Approach to Fuzzy Clustering using ITLBO Algorithm
Data clustering is one of the most important areas of research in data mining and knowledge discovery. Recent research in this area has shown that the best clustering results can be achieved using multi-objective methods. In other words, assuming more than one criterion as objective functions for clustering data can measurably increase the quality of clustering. In this study, a model with two ...
Presenting a Clustering Algorithm for Categorical Data by Combining Criteria
Clustering is one of the main techniques in data mining. It is a process that partitions a data set into groups such that the data within a cluster are as similar to each other as possible, while the data in different clusters are as dissimilar as possible. Clustering algorithms are divided into two categories according to the type of data: Clustering algorithms for numerical data and clustering algor...